38 research outputs found

    Fast filtering and animation of large dynamic networks

    Detecting and visualizing the most relevant changes in an evolving network is an open challenge in several domains. We present a fast algorithm that filters subsets of the strongest nodes and edges representing an evolving weighted graph and visualizes them either by creating a movie or by streaming them to an interactive network visualization tool. The algorithm is an approximation of an exponential sliding time-window that scales linearly with the number of interactions. We compare the algorithm against rectangular and exponential sliding time-window methods. Our network filtering algorithm: i) captures persistent trends in the structure of dynamic weighted networks, ii) smooths transitions between the snapshots of a dynamic network, and iii) uses limited memory and processor time. The algorithm is publicly available as open-source software. Comment: 6 figures, 2 tables
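The exponential sliding time-window that the abstract uses as a baseline can be illustrated with a minimal sketch. This is not the authors' approximation algorithm or their released software: it is a plain exact exponential window with lazy decay, and the function names, the `(time, source, target, weight)` tuple layout, and the decay constant `tau` are assumptions made for illustration.

```python
import math
from collections import defaultdict


def decayed_edge_weights(interactions, tau):
    """Return exponentially decayed edge weights at the time of the last interaction.

    interactions: iterable of (time, source, target, weight), sorted by time.
    tau: decay time constant (hypothetical parameter), in the same units as time.
    """
    weights = defaultdict(float)
    last_seen = {}
    now = None
    for t, u, v, w in interactions:
        edge = (u, v)
        if edge in last_seen:
            # Decay the stored weight only when the edge is touched again,
            # so each interaction costs constant time.
            weights[edge] *= math.exp(-(t - last_seen[edge]) / tau)
        weights[edge] += w
        last_seen[edge] = t
        now = t
    # Bring every edge forward to the common time `now` before comparing.
    return {e: w * math.exp(-(now - last_seen[e]) / tau) for e, w in weights.items()}


def strongest_edges(weights, k):
    """Keep the k edges with the largest decayed weight."""
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]
```

Because each interaction is processed once with a constant amount of work, the weight update is linear in the number of interactions; the final top-k selection here uses a full sort, which a streaming implementation would likely replace.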

    Distinguishing Topical and Social Groups Based on Common Identity and Bond Theory

    Social groups play a crucial role in social media platforms because they form the basis for user participation and engagement. Groups are created explicitly by members of the community, but also form organically as members interact. Due to their importance, they have been studied widely (e.g., community detection, evolution, and activity). One of the key questions for understanding how such groups evolve is whether there are different types of groups and how they differ. In sociology, theories have been proposed to help explain how such groups form. In particular, the common identity and common bond theory states that people join groups based on identity (i.e., interest in the topics discussed) or bond attachment (i.e., social relationships). The theory has been applied qualitatively to small groups to classify them as either topical or social. We use the identity and bond theory to define a set of features to classify groups into those two categories. Using a dataset from Flickr, we extract user-defined groups and automatically detected groups obtained from a community detection algorithm. We discuss the process of manually labeling groups as social or topical and present results of predicting the group label based on the defined features. We directly validate the predictions of the theory, showing that the metrics are able to forecast the group type with high accuracy. In addition, we present a comparison between declared and detected groups along topicality and sociality dimensions. Comment: 10 pages, 6 figures, 2 tables

    Complex networks approach to modeling online social systems. The emergence of computational social science

    This thesis is devoted to the quantitative description, analysis, and modeling of complex social systems in the form of online social networks. Statistical patterns of the systems under study are unveiled and interpreted using concepts and methods of network science, social network analysis, and data mining. A long-term promise of this research is that predicting the behavior of complex techno-social systems will be possible in a way similar to contemporary weather forecasting, using statistical inference and computational modeling based on advances in the understanding of techno-social systems. Although the subjects of this study are humans, as opposed to atoms or molecules in statistical physics, the availability of extremely large datasets on human behavior permits the use of the tools and techniques of statistical physics. This dissertation deals with large datasets from online social networks, measures statistical patterns of social behavior, and develops quantitative methods, models, and metrics for complex techno-social systems.

    Resilience of Supervised Learning Algorithms to Discriminatory Data Perturbations

    Discrimination is a focal concern in supervised learning algorithms augmenting human decision-making. These systems are trained using historical data, which may have been tainted by discrimination, and may learn biases against the protected groups. An important question is how to train models without propagating discrimination. In this study, we i) define and model discrimination as perturbations of a data-generating process and show how discrimination can be induced via attributes correlated with the protected attributes; ii) introduce a measure of resilience of a supervised learning algorithm to potentially discriminatory data perturbations; iii) propose a novel supervised learning algorithm that inhibits discrimination; and iv) show that it is more resilient to discriminatory perturbations in synthetic and real-world datasets than state-of-the-art learning algorithms. The proposed method can be used with general supervised learning algorithms and avoids inducement of discrimination, while maximizing model accuracy. Comment: 17 pages, 10 figures, 1 table

    Estimating community feedback effect on topic choice in social media with predictive modeling

    Social media users post content on various topics. A defining feature of social media is that other users can provide feedback, called community feedback, on their content in the form of comments, replies, and retweets. We hypothesize that the amount of received feedback influences the choice of topics on which a social media user posts. However, it is challenging to test this hypothesis, as user heterogeneity and external confounders complicate measuring the feedback effect. Here, we investigate this hypothesis with a predictive approach based on an interpretable model of an author's decision to continue the topic of their previous post. We explore the confounding factors, including the author's topic preferences and unobserved external factors such as news and social events, by optimizing the predictive accuracy. This approach enables us to identify which users are susceptible to community feedback. Overall, we find that 33% and 14% of active users on Reddit and Twitter, respectively, are influenced by community feedback. The model suggests that this feedback alters the probability of topic continuation by up to 14%, depending on the user and the amount of feedback.

    Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

    Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult due to their non-representativeness and the bias of statistical inference tools towards dominant languages and groups. While demographic attribute inference could be used to mitigate such bias, current techniques are almost entirely monolingual and fail to work in a global environment. We address these challenges by combining multilingual demographic inference with post-stratification to create a more representative population sample. To learn demographic attributes, we create a new multimodal deep neural architecture for joint classification of age, gender, and organization status of social media users that operates in 32 languages. This method substantially outperforms the current state of the art while also reducing algorithmic bias. To correct for sampling biases, we propose fully interpretable multilevel regression methods that estimate inclusion probabilities from inferred joint population counts and ground-truth population counts. In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant step towards representative social sensing in downstream applications with multilingual social media.
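The post-stratification step named in the abstract can be sketched in its simplest textbook form: per-stratum sample means are reweighted by known population shares. This is only an illustration under stated assumptions, not the multilevel regression method the authors propose; the function name, the `(stratum, value)` sample layout, and the stratum labels in the usage note are hypothetical.

```python
def poststratified_mean(sample, population_counts):
    """Reweight per-stratum sample means by known population shares.

    sample: list of (stratum, value) pairs from a non-representative sample.
    population_counts: dict mapping stratum -> ground-truth population count.
    Strata that never appear in the sample are left out of the estimate.
    """
    sums, counts = {}, {}
    for stratum, value in sample:
        sums[stratum] = sums.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1
    covered = [s for s in population_counts if s in counts]
    total = sum(population_counts[s] for s in covered)
    return sum(
        population_counts[s] / total * sums[s] / counts[s] for s in covered
    )
```

For example, with three sampled values `1.0, 1.0, 0.0` in a stratum labeled `"young"`, one value `0.0` in `"old"`, and equal population counts for both strata, the raw sample mean is 0.5, while the reweighted estimate gives each stratum equal influence.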

    Social features of online networks: the strength of intermediary ties in online social media

    An increasing fraction of today's social interactions occur using online social media as communication channels. Recent worldwide events, such as social movements in Spain or revolts in the Middle East, highlight their capacity to boost people's coordination. Online networks generally display a rich internal structure in which users can choose among different types and intensities of interaction. Despite this, there are still open questions regarding the social value of online interactions. For example, the existence of users with millions of online friends casts doubt on the relevance of these relations. In this work, we focus on Twitter, one of the most popular online social networks, and find that the network formed by the basic type of connections is organized in groups. The activity of the users conforms to the landscape determined by such groups. Furthermore, Twitter's distinction between different types of interactions allows us to establish a parallelism between online and offline social networks: personal interactions are more likely to occur on links internal to the groups (the weakness of strong ties), while events transmitting new information go preferentially through links connecting different groups (the strength of weak ties), or even more through links connecting to users who belong to several groups and act as brokers (the strength of intermediary ties). Comment: 14 pages, 18 figures

    Complex Networks approach to modeling online social systems: The emergence of computational social science

    Doctoral thesis presented by Przemyslaw A. Grabowicz to obtain the degree of Doctor in the Physics Program of the Department of Physics of the Universitat de les Illes Balears, carried out at IFISC. This thesis is devoted to the quantitative description, analysis, and modeling of complex social systems in the form of online social networks. Statistical patterns of the systems under study are unveiled and interpreted using concepts and methods of network science, social network analysis, and data mining. A long-term promise of this research is that predicting the behavior of complex techno-social systems will be possible in a way similar to contemporary weather forecasting, using statistical inference and computational modeling based on advances in the understanding of techno-social systems. Although the subjects of this study are humans, as opposed to atoms or molecules in statistical physics, the availability of extremely large datasets on human behavior permits the use of the tools and techniques of statistical physics. This dissertation deals with large datasets from online social networks, measures statistical patterns of social behavior, and develops quantitative methods, models, and metrics for complex techno-social systems. This dissertation has been developed thanks to the support of the CSIC JAE Predoc program and research projects of the Institute of Interdisciplinary Physics and Complex Systems in Palma de Mallorca. Peer Reviewed
